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ABSTRACT 

Variations of the antenna primary beam (PB) pattern as a function of time, frequency and polarization form 
one of the dominant direction-dependent effects at most radio frequency bands. These gains may also vary 
from antenna to antenna. The A-Projection algorithm, published earlier, accounts for the effects of the narrow- 
band antenna PB in full polarization. In this paper we present the Wide-Band A-Projection algorithm (WB 
A-Projectioii) to include the effects of wide bandwidth in the A-term itself and show that the resulting algo- 
rithm simultaneously corrects for the time, frequency and polarization dependence of the PB. We discuss the 
combination of the WB A-Projection and the Multi-term Multi Frequency Synthesis (MT-MFS) algorithm for 
simultaneous mapping of the sky brightness distribution and the spectral index distribution across a wide field 
of view. We also discuss the use of the narrow-band A-Projection algorithm in hybrid imaging schemes that 
account for the frequency dependence of the PB in the image domain. 
Subject headings: Techniques: interferometric - Techniques: image processing - Methods: data analysis 



1. INTRODUCTION 

Observations in the radio band offer distinct, and often 
times unique, scientific advantages in probing certain areas of 
astrophysical research (e.g in the detection of the EoR signal, 
studies of the high-redshift universe in general, large-scale 
structure formation, early galaxies, etc.). 

All next generation radio telescopes, many in operation 
now, offer at least an order of magnitude improvement in 
the sensitivity and angular resolution compared to the tele- 
scopes operated in the past decades. The two key instrumen- 
tal parameters which afford such high sensitivities, impact the 
imaging performance and are significantly different from pre- 
vious generation telescopes are: 1) the wide instantaneous 
fractional bandwidths, and 2) larger collecting area. The ef- 
fects of wide instantaneous fractional bandwidths that classi- 
cal calibration and imaging algorithms ignore, lead to errors 
higher than the sensitivity that these new telescopes offer Ex- 
amples, relevant for some of the telescopes already in opera- 
tion include the effects of time and frequency variant primary 
beams, frequency dependence of the emission from the sky 
and antenna pointing errors. The effects of wide fractional 
bandwidth and ionospheric phase screen limit the imaging 
performance below ~lGHz. Additionally, significant varia- 
tions in the shape of the wide-band primary beams (PB) for 
aperture array telescopes leads to errors of similar magnitude. 
All these effects form the general class of problems referred to 
in the literature as "direction dependent effects" or DD effects. 

Both, wide fractional bandwidths and larger collecting area 
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lead to many orders of magnitude increase in the data volume, 
putting severe constraints on the run-time performance of the 
algorithms for calibration and imaging. Furthermore, the cost 
of software development and maintenance also scales with al- 
gorithm complexity. Efficient algorithms to simultaneously 
account for all time-, frequency- and polarization-dependent 
DD effects which can also process large data volumes without 
significantly increasing algorithmic and software complexity 
are required. 

In the following sections we discuss various possible ap- 
proaches to full-beam wide-band continuum imaging. We 
present a modification of the A-Projection algorithm which 
we call the Wide-band A-Projection, or WB A-Projection al- 
gorithm where a modified A-term also compensates, to a large 
extent, the frequency dependence of the PB. We also discuss 
the use of the unmodified A-Projection algorithm (Bhatna^ 
|gar et al.|2008| henceforth referred to as Paper-I), which we 
call the Narrow-band A-Projection, or NB A-Projection, along 
with various forms of image-plane normalizations for wide- 
band continuum imaging and the resulting issues and limita- 
tions. 



2. THEORY 

Using the notation developed by |Hamaker et al. ( |1996 1, 
full polarimetric measurements from a single baseline cali- 
brated for the effects of direction-independent gains, can be 
described by the following Measurement Equation 



^^''''(y,t)^Wu(v,t) 



/ 



IJ ^ ' 



,t)f(s,v)e'^-'-^ds (1) 



where V^'^ are the observed visibility samples measured by 
the pair of antennas designated by the subscript / and j, sep- 
arated by the vector bij and weighted by the measurement 
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weights Wij. p. is the radio-Mueller matrix in the image 
domain representing the full polarization description of the 
antenna primary beams as a function of the direction s, fre- 
quency V and time t and / is the image vector The vectors V 
and / are full polarization vectors in the data and image do- 
main respectively, f ■ ^ and / are the unknowns in this equa- 
tion. 

Equation [T] cannot be directly inverted as, in general, it is 
not a Fourier transform relation. It is also sampled only at 
a limited number of points, and therefore the data has in- 
sufficient information to allow an exact solution. Estima- 
tion of / is there fore typically done via iterative n on-linear 



Using the notation described above, the x^ can be written 



;i^^ -minimization (Cornwell 1995 Rau et al. 2009| l. Below 



we briefly review the theory of imaging with A-Projection to 
correct for the time and polarization dependence of P ' in 
narrow-band imaging and motivate the need for a Wide-band 
A-Projection algorithm to also correct for frequency depen- 
dence of P ■ in wide-band imaging. 

2.1. Imaging with A-Projection 

To clarify the full-polarization nature of the A-Projection 
algorithm, we define the outer-convolution operator and de- 
note it by the symbol ®. The outer-convolution operator is 
similar to the outer-product ope ration used in the dir ection- 
independent (DI) description of Hamaker et al. (11996) with 
a minor difference. The element-by-element algebra of the 
outer-convolution operator is the same as that of the outer- 
product operator, except that the complex multiplications in 
outer-product are replaced by convolutions. Using the outer- 
convolution operator and the sub-scripts / and /' to explicitly 
denote the antenna pair for baseline / - j, the A-matrix used 
in A-Projection at a frequency v and time t can be written in 
terms of antenna based quantities as 



Aij = J,®J* 



where 



J, 



I 



E"- 

1 

Ef 



(2) 



(3) 



E'^ and E'' are the polarized antenna aperture illumination pat- 
terns for the two polarization states. The off-diagonal terms 
are the leakage patterns. A^ is the DD equivalent of the 4x4 DI 
Mueller matrix for a given antenna pair The elements of Aij 
are the complex convolution of the two antenna aperture illu- 
mination patterns ( E'^ • E^ j, (E'^ • E''^'' j, etc. For compar- 
ison, the elements of J,® J* would be (Ef ■ E'j'), (Ef ■ E^"«*), 
etc. 

To keep the notation simple, in the following description 
we use a single sub-script k = (ij, v, t) to refer to a mea- 
surement from a single baseline ij, at a spot frequency v and 

an instant in time t. The vectors V and 6 are full polariza- 
tion vectors whose elements are 2D functions in the visibility 
plane (the uv-plane). Elements of V are the 2D visibility data 
and elements of 6 are 2D Delta functions representing the uv- 
sampling function for the data sample k. The super-scripts 
obs, M and o refer to the observed, model and true values 
respectively. 

' This matrix as used in radio interferometric literature differs from that 
used in the optical literature only in that in radio it is written in the polariza- 
tion basis (circular or linear polarization) while in the optical literature it is 
written in the Stokes bas is. These radio and op tical representations are related 
via a Unitary transform JHamaker et al.|1996|. 



as 
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where V^ - V^*' - Vj^ and At is inverse of the noise covari- 

ance matrix. The vector V'^' can be expressed in terms of A 
as 

i^o*^ = (A°*y°)<5, (5) 

Note that, as mentioned before, the elements of A°, V° and 
Sk are 2D functions. The symbol '•' represents the element- 
by-element convolution. V° - without a sub-script - repre- 
sents the true continuous Coherence function. V^'*' represents 
a sample of this Coherence function measured at the parame- 
ters represented by sub-script k. 

The calibration matrix for Eq.|5]to correct for the effects of 
A° is A° ' given by 



A° = 



adj{Al) 
det[A°i^ 



(6) 



The equivalence between Eq. [6] as a generalized direction- 
dependent (DD) calibration and standard direction- 
independent (DI) calibration is discussed in more detail 
in section 12.1.21 

As in DI calibration where calibration is done by the appli- 
cation of the inverse of the appropriate Mueller matrix, cor- 
rection for the effects of A requires the application of A '. 
The difference between DI and DD calibration is that while 
the operator for the application of the DI calibration matrix 
to the data is the matrix multiplication operator, for DD cal- 
ibration this operator is the element-by-element convolution 
operator (the • operator in Eq.lSl). 

Since DD calibration fundamentally cannot be separated 
from imaging, the application of the A"' matrix is done via 
i\\t A-Projection algorithm. This is achieved in two steps. The 
term in the numerator of Eq. l6] adj (At), is applied during re- 
sampling of the observed data (the right hand side of Eq. l5]l 
on a regular grid using convolutional gridding with A 






a 



model of A^ , as the convolution function. The resulting grid- 
ded data is accumulated in the data domain and then Fourier 
transformed to compute the continuum image. The scaling by 
the denominator of Eq. p is done by also accumulating A^ 
and diving the image by its Fourier transform. The resulting 
image using A-Projection is given by 



Ei:kadj{A^)-k{Al*V°)6k 
det{F[ZkAf]) 



(7) 



This effectively applies the DD calibration operator A ' and 
corrects for its effects, provided A^ is a close enough approx- 
imation of A°. Further details, results and discussion on the 
imaging performance of A-Projection are in Paper-I. 

For continuum imaging, the accumulation for all k in Eq. |7] 
can be done in either domain (data or image domain). Con- 
tinuum imaging of wide-band data using the MFS approach 
is done by accumulation in the data domain. Since the A- 
Projection algorithm does not explicitly account for the fre- 
quency dependence of A, an algorithm to project-out this fre- 
quency dependence before accumulation is required. The WB 
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A-Projection algorithm for this is described in sectionIS] Vari- 
ous hybrid imaging algorithms using the NB-A-Projection al- 
gorithm and accumulation in the image domain are also pos- 
sible. These are discussed in section |4] 

2. 1 . 1 . Algorithmic steps for A-Projection 

For completeness and as a reference for later discussions, 
the algorithmic steps for MPS imaging using i\\t A-Projection 
algorithm described in Paper-I are repeated below: 

1 . Initialize the model and the residual images 7*^ and /* 

2. Major cycle: 

• Predict the model data and accumulate A/, as: 

k 



T/Ofci 

^k 



fM 



• Compute the residual data Vf 

• Use Eq.lTlto compute the continuum residual im- 
age as /* = (F Y.k Vk) /det (FA*^) 

3. Minor cycle: Invoke the appropriate minor-cycle algo- 
rithm using P to solve for image-plane parameters and 
update the model image P^. 

4. If not converged, go to Step|2] 

Since Aj does not change from one major cycle to another, 
accumulation of A*^ is done in the first major cycle and cached 
for use in subsequent major cycles. 

2.1.2. A-Projection: A direction-dependent gain correction 
algorithm 

The antenna illumination pattern is essentially a direction- 
dependent description of antenna based complex gains in the 
data domain. Aj. is a direction-dependent generalization of the 
G-Jones matrix - the direction-independent Mueller matrix 



for antenna gains in the Hamaker et al. ( 1996| formulation 
and since the A-Projection algorithm corrects for the effects 
of Alt, it can be thought of as an algorithm for DD calibration. 
To establish the equivalence between DD corrections via 
A-Projection and DI antenna-based complex gain correction, 
we note that Eq. ISlis the DD equivalent of DI measurement 
equation given by: 



•J 



G.J 



v°. 

•J 



(8) 



A^^ in Eqs.l2]andl5]is the DD equivalent of G,y and the outer- 
convolution C®') and the '•' operators are the DD equivalent 
of the outer-product ('®') and element multiplication. Cali- 
bration for Gij is done by multiplying Eq. plby Gr.' given by 



(^iJ 



(9) 



For calibration, Eq.l6]is the DD equivalent of the above equa- 
tion for the DD calibration. 

For an intuitive understanding, we note that for the sim- 
pler case where the off-diagonal elements of J are negligible, 
adjikk) = A"'" wddet(Ak) = trace{Ak)^El''E'f . For this 



simpler case, examination of Eqs. [6] and [9] shows the equiva- 
lence between DI gain calibration and the A-Projection algo- 
rithm more clearly. The process of imaging using the adj (Ak) 
and normalizing the resulting image by det{Y^k ^k) therefore, 
even for the general case, is the DD generaUzation of the 
antenna-based complex gain calibration. 

3. THE WB A-PROJECTION ALGORITHM 

adj (Ak), when used in the A-Projection algorithm, corrects 
for the polarization and other DD effects that can be encoded 
in the phase of A° (e.g. time-varying gains due to polarization 
squint, ionospheric phase-screen, etc.). It is however not a 
conjugate operator for variations along the time or frequency 
axis. 

The image domain effects of the time varying gains are 
largely in the amplitude scaling only (i.e., they do not dis- 
perse the flux in the image domain). Since the minor cycle 
algorithms typically assume that the sky brightness distribu- 
tion is time-invariant and do not parametrize the model image 
in time, the effects of such variations can be ignored in the 
transform from data to image domain. If the deviations from 
the average value are small (e.g. for antenna arrays and long 
integrations where time variability is cyclic, or antennas with 
three-axis mounts where beams do not rotate on the sky) the 
model prediction stage, which properly includes these effects, 
corrects for time variability in an the iterative deconvolution 
scheme. 

Frequency dependence of A° also varies with time (and di- 
rection). This variation is not cyclic and its maximum devi- 
ation from the average value increases with fractional band- 
width. Due to this, as for time varying gains, while the fre- 
quency dependence of A° in the inner part of the main lobe of 
the PB can also be corrected via the model prediction stage, 
the convergence is significantly slower (requiring more ma- 
jor cycles and hence higher computing). Alternatively, this 
frequency dependence can be absorbed, to some extent, in 
the multi-term MFS (MT-MFS) minor cycle algorithm which 
solves for the time-invariant frequency dependence in the im- 
age plane. But this requires more Taylor-terms, which also is 
inadvisable (see section |4.3.3| l. 

Correction for the time-variable frequency dependent ef- 
fects of A° requires a wide-band version of the A''^ matrix 
such that adj {Af\ • A^ results in a function which does not 

vary with frequency (i.e., adjlA'^\ is also a conjugate op- 
erator for frequency). The frequency dependence will then 
be projected-out prior to accumulation, resulting in an image 
corrected for the frequency dependent effects of A. 

3.1. The wide-band A operator 

For the reasons givens above, as well as to keep the pa- 
rameters for modeling the instrumental effects in the data do- 
main separate from parameters for modeling the sky bright- 
ness distribution, we need to construct the A»(v) matrix which 
projects-out the dependence on frequency during imaging. 
One such possibility is the following: 



A.(v) = F- 



Peffiy)P:ff(y) 



P(v) 



(10) 



where P(v) - F' A(v) and Pefjiy) is the desired effective PB 
which varies minimally with frequency. In the data domain, 
adj (At(v)) • A(v) can be shown to be frequency-independent 
to high orders, and this use of A»(v) as a model for A(v) in the 
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Fig. 1 . — The plot in the left panels shows the one dimensional cuts through the PB model at a reference frequency (continuous red line) for the VLA and its 
first (dashed blue line) and second derivatives (dash-dot green line) with respect to frequency. The plot on the right shows the same cuts through the Pgff{v) - 
the effective frequency-independent PB. 



reverse transform can correct for the frequency dependence of 
A(v). However it also has a large support size, and therefore, 
in itself, is not an efficient reverse transform operator. 

3.2. The Conjugate frequency 

Equation[TO]is valid for any frequency dependence. For the 
special case of PB scaling with frequency, we explore usable 
approximations for A*, we define the conjugate frequency v«, 
given by^: 

^~ " (11) 



v« 






where Vref is the reference frequency of the continuum image, 
and examine the effects of choosing A»(v) = A(v»). 

Using the same model for A as used in Paper-I (i.e. a model 
for the VLA antenna PB), one dimensional cuts through the 
model PB given by Pm{v) - FA(v), the effective PB given by 
Peff(y) = F [A(v.) • A(v)] / [FA(v,.j,/)J and the first and sec- 
ond derivatives of Peff{v) with respect to frequency are shown 
in Fig.fT] A comparison of the derivatives of P(v) and Pefjiy) 
with frequency, shown in the two panels as blue dashed lines, 
shows that the effective PB is frequency-independent to the 
first order. While it changes in structure, the maximum sec- 
ond derivative remains almost the same in magnitude. These 
figures show that the approximation in Eq. 1 1 and use of A(v,) 
is good enough for imaging data which are not sensitive to the 
higher order frequency dependent effects. This approximation 
is useful since it can be easily implemented, is appropriate for 
the sensitivity of current telescopes and covers a large frac- 
tion of scientific observations for simultaneous Stokes-I and 
spectral index mapping. The frequency dependence in A(v) is 
reduced overall by an order of magnitude. When used in the 
A-Projection algorithmic steps in section 2.1.2| it effectively 
corrects for the frequency dependence of the PB prior to the 
accumulation along the frequency axis in Eq. |7] For future, 
more sensitive telescopes which will be sensitive to second 
order frequency dependent effects also, A» as in Eq. 



10 



may 

be required. However, when software implementation itself 
requires partitioning along the time and/or frequency axis, hy- 
brid approaches discussed in sectionfflmay be more efficient. 
In practice, there might be multiple terms that make up the 
A-term, some of which may scale with frequency while oth- 

- This expression is arrived at by using a gaussian approximation for the 
PB and imposing the condition that P(v»)P(v) = Peffiv) 



ers may not. E.g. Aperture Illumination will scale with fre- 
quency, but pointing-offset term or resonances effects will not. 
The terms that do not scale with frequency can be applied as 
multiplicative terms to A(v») during gridding. We have tested 
this approach to work for parallel hand polarization effects. 
While we have not tested it, we think this may also work for 
cross-hand polarizations. 

To avoid confusion and for brevity, we will refer to the al- 
gorithm in Paper-I as the Narrow-band A-Projection or NB 
A-Projection algorithm, and refer to the use of A(v,) (instead 
of A(v)) for the data-to-image domain transform as the WB 
A-Projection algorithm. 

3.3. Algorithm Validation 

The image deconvolution algorithm described in section l3] 
was tested using simulated wide-band data with 66% frac- 
tional bandwidth. The VLA C-array was used for antenna 
configuration and the observations covered Hour Angle range 
of ±3''. The model for the PB used in Paper-I was scaled 
by frequency and rotated with Parallactic Angle to simulate 
time-varying frequency dependent effects. To clearly high- 
light the effects of time and frequency dependence of A, we 
used a model of the sky consisting of five point sources lo- 
cated at 0.99, 0.83, 0.60 and 0.11 levels of the PB within the 
main lobe and one source located in the first side lobe (PB 
gain of 0.025). All the point sources were assigned a flux of 1 
Jy with flat spectra. The effective spectral-indices due to the 
primary-beam at the five locations are -0.026,-0.38, -1.0,-5.32 
and -1-0.47 respectively. No noise was added to these simula- 
tions, and all imaging and deconvolution runs were with a 
loop-gain of 0.2. 

Figurel2]shows deconvolved images produced without (first 
panel) and with (second panel) WB A-Projection gridding. 
This comparison demonstrates that with an accurate model of 
the Primary Beam, it is possible to correct-for its time- and 
frequency-variability down to numerical precision levels in 
wide-band wide-field imaging. 

4. APPLYING NB A-PROJECTION FOR WIDE-BAND IMAGING 

NB A-Projection was designed for a single reference fre- 
quency and does not automatically account for the frequency- 
dependence of P during gridding. However, it can still be used 
for wide-band imaging, as long as the frequency dependence 
of the far-field pattern is known and characterized by Py. Sev- 
eral algorithmic options exist, all with different numerical ap- 
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Fig. 2. — This figure shows imaging performance before and after applying coiTections for the time and frequency dependence of the PB during imaging. The 
sky is assumed to have a flat-spectrum, and standard MFS imaging is done. Both restored images are shown at the same gray-scale, stretched to emphasize 
artifacts. Contours are drawn at the 0.02, 0.1 and 0.5 (HPBW) levels of the time-and-frequency averaged Primary Beam. No noise was added to the simulated 
visibilities, in order to clearly illustrate the noise-like artifacts produced by time-variable DD-effects. 

LEFT : Standard MFS-imaging and deconvolution, using a prolate-spheroidal gridding convolution function. Dominant errors are due to the time and frequency 
variability of the PB. Ofi'-source RMS : 4 X 10"" 7v, Peak Residual : 1.8 X lO^Vy 
RIGHT : MFS-imaging and deconvolution, using WB A-Projection to account for both time and frequency variability during gridding. Off-source RMS : 



n-^ 



-7 J 



1.5 X 10"'7y, Peak Residual : 7 X 10"'7v 

proximations and computing load. This flexibility allows the 
implementation to be tuned according to the available com- 
puting resources, architecture of the hardware platform and 
the desired imaging accuracy. 

4.1. Cube imaging + Cube deconvolution 

The simplest approach is spectral-cube imaging where each 
frequency channel (with its limited uv coverage) is treated 
separately. NB A-Projection is applied as is per channel, and 
the minor cycle run independently per channel. A continuum 
image is later constructed by adding together deconvolved and 
restored images from all channels. 

The residual image per channel can be approximately re- 
written from Eq.jTJin the image domain, for the case where the 
aperture illumination functions are identical for all baselines 
and times, and only one polarization-pair is being imaged. 



/« = 



p.-(/r^*(p.-/f^)) 



(12) 



jpsf 



where I'"^ - F Yik ^k,v (k = ij, t) is the point spread function 
for one channel and Py = F 'Ay. In this expression, the divi- 
sion by Py~ implies a flat-sky^ normalization, but aflat-noise* 
normalization may be used instead. 

^ Flat Sky normalization : P- in the denominator of Eq. |l2| gives a resid- 
ual image in which the peak brightness is free from the primary beam but the 
noise level is position dependent. /'^ does not strictly follow a convolution 
equation, and may require shallow minor cycles if the PSF is not well be- 
haved. The output model image from the minor cycle represents only the true 
sky /■'*>'. 

'^ Flat Noise normalization : P in the denominator of Eg. |l2| instead of 
P gives a residual image representing the signal-to-noise ratio at all pixels. 
Also, /'^ satisfies a convolution equation, allowing for deeper deconvolution 
in the minor cycles. The model image will however represent P ■ /'*■', and a 
post-deconvolution division of this model by P will be required. 



This method is straightforward and will suffice for modest 
imaging dynamic ranges and uncomplicated spatial structure 
(point sources). However, the angular resolution of the contin- 
uum image and any estimate of the sky spectrum will be lim- 
ited to that of the lowest frequency in the band. Also, recon- 
struction uncertainties may be inconsistent across frequency 
when there is insufficient Mv-coverage per channel or compli- 
cated spatial structure, leading to spurious spectral structure. 

In this paper, we are focusing on imaging problems that re- 
quire more accuracy and dynamic -range than what the above 
offers. In the next two sections, we discuss A-Projection in the 
context of multi-frequency-synthesis (MFS) where the com- 
bined Mv-coverage is used for model reconstruction (minor cy- 
cle). 

4.2. Cube imaging + MFS deconvolution 

The simplest extension of NB-A-Projection for multi- 
frequency synthesis is to grid, Fourier transform and nor- 
malize each frequency channel (or sets of channels) indepen- 
dently, and then produce a continuum image by an image- 
domain accumulation, before the minor cycle. When the sky 
spectrum is not flat, or when flat-noise normalization is used 
per channel, a wide-band minor-cycle algorithm such as MT- 
MFS can be applied to simultaneously solve for the sky in- 
tensity and spectrum. Taylor-weighted residual images are 
constructed as follows, before proceeding to the minor cycle. 



^f-Z<^v=Z' 



'Py{lyP'f*{PyI, 



sky\ 



(13) 



where w[, - (^-7^) are weights that represent Taylor polyno- 
mial basis functions (Rau & Cornwell|2011| l. The interpreta- 
tion of the output Taylor coefficients depends on the choice of 
normalization as follows. 
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Fig. 3. — These figures compare the imaging performance before and after applying connections for the time and frequency dependence of the PB during 
imaging. All restored images are shown at the same gray-scale, stretched to emphasize artifacts. Conto urs a re drawn at the 0.02, 0.1 and 0.5 (HPBW) 
levels of the time-and-frequency averaged Primary Beam. Results from four algorithms described in Sec. |4.3| are compared here (MFS+SI, MT-MFS+Sl, 
MT-MFS+A-Projection, MT-MFS+WB A-Projection). RMS and peak-residuals are listed in the table below. 



Panel 


Algorithm 


Description 


RMS 

(Jy/beam) 


Peak Residual 
(Jy/beam) 


Comments 


Top Left 


MPS + SI 


Standard Wide-band Imag- 
ing 


6 X 10-* 


2.3 X 10-3 


Ignore time & frequency dependence. Artifacts 
due to time and frequency variations of the PB. 


Top Right 


MT-MFS+ SI 


Multi-term Imaging with 
Standard Giidding 


1 X 10-* 


5 X 10-^ 


Ignore time dependence. Absorb time-averaged 
frequency dependence in MT-MFS. Artifacts 
due to time- variability of the PB. 


Lower Left 


MT-MFS+ A- 
Projection 


Multi-term Imaging with 
NB A-Projection gridding 


4 X 10-5 


8 X lO-'* 


Account for time variability of PB, and absorb 
the resulting PB frequency dependence in MT- 
MFS. Artifacts due to stronger spectral struc- 
ture. 


Lower 
Right 


MT-MFS+ WB A- 
Projection 


Multi-term Imaging with 
wide-band A-Projection 
gridding 


3.5 X 10-5 


2 X lO-'* 


Account for PB time- & frequency-dependence 
in WB A-Projection. Account for static sky- 
frequency dependence in MT-MFS. Minimal ar- 
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1 . W\\h flat-sky normalization per channel, the output Tay- 
lor coefficients will represent ll ' . 

2. For flat-sky normalization per channel, but a regular- 
ized flat noise minor cycle, the Taylor- weighted resid- 
ual images can be multiplied by a Pref after the fre- 



quency summation and before deconvolution. The out- 
put Taylor coefficients will represent Prefll * and a 
post-deconvolution division by Pref will be required for 
the intensity image. No corrections are needed for the 
spectral index map. 
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Fig. 4. — This figure compares the accuracy of the PB-corrected intensity (LEFT) and spectral-index (RIGHT) for the five simulated point sources, using the 
four methods whose results are shown in Figurels] The labels "AWP" and "WBAWP" are used for A-Projection and WB A-Projection in the figure. The algorithms 
compared are MFS+SI, MT-MFS+Sl, MT-MFS+A-Projection and MT-MFS+WB A-Projection. Spectral-indices are shown only for methods using MT-MFS, with 
post-deconvolution (average) spectral-index corrections done for the SI and A-Projection runs. Results for the five sources are shown from left to right with 
increasing distance from the pointing-center. The reference-PB gain and eft'ective PB-spectral-index at the locations of the five sources are listed on the x-axis. 
These plots show that outside the HPBW at the reference-frequency, methods that do not account for time-variable PB-spectra have considerably higher errors, 
and the combination of MT-MFS+WB A-Projection delivers accurate corrections even out in the sidelobe. 



3. With flat-noise normalization per channel before 
Taylor-weighted averaging, the output Taylor coeffi- 
cients will represent Pyly , and a post-deconvolution 
polynomial division of the primary beam spectrum will 
be required to correct both the intensity and the spectral 
index. 

Qualitatively, the reverse transform with^af-.?^ normaliza- 
tion per channel followed by frequency-averaging and a flat- 
noise regularization by Pre/ before the minor cycle, is equiva- 
lent to the use of WB A-Projection during gridding. It requires 
multiple image grid planes (one per channel), FFTs and has 
repeated beam divisions and multiplications that can increase 
numerical errors. However, it is naturally parallelizable, mak- 
ing it an attractive option for extremely large data sets and 
imaging goals where inaccuracy in low gain regions of the 
primary beam can be tolerated. 

4.2. 1 . MFS imaging + MFS deconvolution 

To optimize on memory use and FFT costs (especially in a 
non-parallel imaging run), MFS gridding can be done, where 
averages over baseline, time and frequency are accumulated 
onto a single grid, followed by a single FFT and normalization 
by an average primary beam (or its square). Here, attention 
must be paid to the consequences of averaging over frequency 
before normalization. The use of adj (A(v)) as the gridding 
convolution function (the NB -A -Prq/'ecf /on algorithm), intro- 
duces an additional frequency dependence Py that gets aver- 
aged over before it can be removed. Once gridded, this extra 
frequency dependence is locked in, and can be accounted for 
only as an artificially steeper spectrum in the minor-cycle of 
the MT-MFS wide-band imaging algorithm. 

Multi-term Taylor-weighted residual images must be con- 
structed as follows. 



Zjv "v 



(14) 



and the output Taylor coefficients will depend on the choice 
of normalization as follows 



1. For flat-sky normalization, the model intensity repre 

ref 



the sky spectrum and the square of the primary-beam 
spectrum. 



2. For flat-noise normalization, the model intensity repre- 
sents f/fPavt; and the spectrum represents the product 
of the sky spectrum and the square of the primary beam 
spectrum. 

In both cases, appropriate wide-band post-deconvolution cor- 
rections for the average primary beam and two instances of 
the primary beam spectrum, must be applied. 

Such corrections are inelegant, and are susceptible to nu- 
merical instabilities in low gain regions of the primary beams. 
Section 4.3 shows a comparison of some of these methods 



sents /l/^, but the spectrum represents the product of 



with WB A-Projection, for a simulation with source spectra 
that are not flat and therefore require MT-MFS imaging and 
deconvolution. 

4.3. Comparison of Hybrids with WB A-Projection 

To test the algorithm described in sections [3] and 3.1 with 
non-flat source spectra, we used the sky brightness distribu- 
tion as in Fig. l2] but assigned a spectral index of ff = -0.5 to 
all sources such that /(v) oc (v/vo)". 

Figure [3] shows deconvolved images produced with 
and without time-dependent and frequency-dependent PB- 
corrections during gridding, emphasizing the different types 
of error-patterns that arise when one or more eff'ects are ig- 
nore d. An image formed from the hybrid method described in 
Sec. 4.2.1 to absorb all frequency-dependence into the minor 
cycle solver is also shown for comparison. Figure HI shows 
Stokes-I and spectral index values for these point-sources af- 
ter PB -correction, to illustrate the accuracy to which differ- 
ent methods are able to recover the true-sky spectral index at 
various locations in the PB. The various methods tested and 
results obtained are described below. 

4.3. 1. MFS + SI (Standard Imaging) 

The image in the top left panel of Fig.plis the result of stan- 
dard Cotton-Schwab Clean with MFS gridding using prolate- 
spheroidal functions as gridding-convolution functions, and a 
flat-spectrum assumption during the minor cycle. 

/« = ;^ {(//■'/ *(p,.//.v))} (15) 
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Time and frequency variability of both the sky and the instru- 
ment are ignored, and for a 66% bandwidth, imaging artifacts 
around all sources away from the pointing-center are dom- 
inated by spectral-effects due to Py present in the data. A 
post-deconvolution division by an average primary beam can 
recover the true source intensity to within a few percent, out 
to the half -power point of the PB, but errors increase with dis- 
tance from the pointing center. 

4.3.2. MT-MFS + SI 

The image in the top right panel of Fig.[3]is the result of the 
MT-MFS algorithm in the minor cycle, with standard grid- 
ding (prolate-spheroidal functions). The minor cycle solves 
for the average intensity and spectrum of I{v)P(v) using a 2- 
term Taylor-polynomial approximation. 

^"^Z^K^"'''*!^"-^"^''))} (16) 

V 

Average PB-spectral effects are absorbed into the sky model, 
and the dominant remaining error is due to the time-variability 
of the primary -beams. A post-deconvolution correction of the 
continuum intensity and spectral-index are accurate to within 
a few percent in intensity and ±0. 1 in spectral index out to ap- 
proximately the half -power point. Beyond this field-of-view, 
errors increase (to ±0.4 or more in spectral index) primar- 
ily because a time-averaged primary-beam spectrum is not a 
good estimate in regions of the image where Py changes by 
100% with time as the beams rotate on the sky. 

4.3.3. MT-MFS + A-Projection 

The image in the bottom left panel of Fig. l3] is the result 
of MT-MFS in the minor cycle (2 terms), but w ith the NB 
A-Projection gridding as described in Sec. |4.2.1| with a flat- 
noise normalization before the minor cycle, followed by a 
post-deconvolution correction of the intensity by Pyej (aver- 
age primary beam) and the spectral index by twice that of 
the primary beam. Artifacts due to frequency-independent 
time-variability (antenna rotation) no-longer exist within the 
HPBW (PB gain of 0.5), but new spectral artifacts appear 
away from the pointing-center (beginning around the 10% 
level). 

These errors are partly due to the increased non-linearity of 
the P^iy) spectrum away from the pointing center, for which 
a two-term Taylor-polynomial approximation is insufficient. 
A run with 3 terms partially reduces this problem, indicat- 
ing that errors in approximating the combined spectrum with 
a low-order polynomial dominates the errors, but higher or- 
der polynomials are inadvisable because of instability in low- 
SNR regions. 

Errors also arise from the high time variabiUty of the PB- 
spectrum, which is ignored because only time-averaged spec- 
tra are used for spectral-correction. A post-deconvolution cor- 
rection of the spectral-index map for P^iy) results in errors at 
the ±0.3 level beyond the ~50% point. 

4.3.4. MT-MFS + WB A-Projection 

The image in the bottom right panel of Fig.|3]is the result of 
MT-MFS in the minor cycle (two terms), and WB A-Projection 
gridding (section l3]l. Artifacts around all sources are gone, 
and the dominant errors are numerical (at the floating-point 
precision level). The spectral-index map produced by MT- 
MFS is accurate to within 0.01 in the main lobe, and 0.05 
out in the sidelobe. Such accuracies allows the recovery of 



source-spectra further-out in the primary-beam than previ- 
ously possible. The main difference between this method and 
all others, is that time and frequency variability of the pri- 
mary beam has been corrected for in the data domain, before 
any averaging is done to construct a continuum image to send 
to the minor cycle. The minor cycle sees a flat-noise normal- 
ization, preserving the convolution-equation and allowing for 
deeper 'cleaning' before triggering the next major cycle (i.e. 
faster convergence). 

This method shows the lowest errors in Fig. HI indicating 
that if the primary beam can be accurately modeled, its time 
and frequency variability can be corrected for during gridding, 
resulting in an accurate reconstruction in the minor cycle. 

4.4. Imaging results with VIA L-Band data 

Figure l5] shows the continuum intensity and spectral in- 
dex distribution of the G55.7+3.4 Galactic supernova rem- 
nant (SNR), using the VLA in L-band and D-configuration 
and made with and without the WB A-Projection algorithm. 
The peak brightness is 6mJy, and with an off-source RMS of 
1 1 yiiJy, this is a modest dynamic range. The peak brightness 
comes from a background pulsar with a known spectral index 
of -2.3, and the brighter synchrotron-emission filaments are at 
the ImJy level. The half-power beam width (HPBW) of the 
PB is 30 arcmin, and extended emission from the SNR fills 
the PB at the reference frequency. The spurious spectral in- 
dex at the HPBW due to the primary beam variation between 
1-2 GHz is approximately -1.4. 

The left column of panels in Fig.lSlshows continuum inten- 
sity, and the right column shows corresponding spectral index 
maps. 

1. The top row shows flat-noise results with MTMFS+SI, 
where A-Projection was not used, and primary-beam 
correction was not done. There is considerable artifi- 
cial steepening (darkening on the plot) of the observed 
source spectrum as distance from the pointing center 
increases. The spectral index of the bright background 
pulsar is -3.05. 

2. The middle row shows flat-sky results from the same 
run as above, where MTMFS+SI was followed by a 
post-deconvolution wideband PB correction of both the 
intensity and the spectrum. The spectral index map 
shows that this post-deconvolution correction has re- 
stored the spectral indices of the outer part of the SNR 
as well as the background sources to more realistic val- 
ues. The spectral index of the bright background pulsar 
is -2.61 (after correction for an average estimated PB 
spectral index of -0.44 at the 0.8 gain level). 

3. The bottom row shows flat-sky results from an 
MTMFSh-WBAWP run where the intensity image has 
been corrected for the average PB gain, but the spectral 
index image is just what came out of the imaging run. 
The noise properties of the intensity image are slightly 
better than the middle row, and the spectral index map 
shows slightly more coherent and less noisy structure 
across the SNR. The spectral index of the bright back- 
ground pulsar is -0.29, which is the closest so far to the 
expected value. 

5. CONCLUSIONS 
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Fig. 5. — The left column shows wide-band continuum Stokes-I images of the Galactic SNR G55 imaged with the VLA centered at 1.5 GHz covering the 
frequency range 1.256 - 1.905 GHz. The right column shows the spectral index maps. The top and middle rows are the results from MTMFS+SI imaging, 
without and with a post-deconvolution wideband PB-correction respectively. The artificially high spectral index around the edge of the SNR emission and farther 
out, in the top row, is due to the PB. This is not present in the middle row, but the spectral index map is still noisy. The bottom row shows the fl at-sk y results from 
WnMFS+WBA-Projection with spectral indices closer to their expected values than either of the other methods. Details are discussed in Sec.|4.4| 



In this paper we describe the wide-band A-Projection {WB 
A-Projection) algorithm, which extends the narrow-band A- 



Projection (NB A-Projection) algorithm ( Bhatnagar et al. 
|20081 l to correct for the wide-band effects of the PB prior 
to integration in time and frequency for continuum imaging. 
We demonstrate that the combined WB A-Projection and MT- 
MFS algorithm for simultaneous intensity and spectral index 
mapping performs as expected. 



Theoretical analysis in section 2.1.2 draws equivalence be- 
tween standard antenna complex gam and bandpass calibra- 



tion and the NB A-Projection and WB A-Projection algorithms 
respectively and show that the latter two algorithms are the 
direction-dependent generalization of the former direction- 
independent algorithms. The A-term of the A-Projection al- 
gorithm represents the direction-dependent (DD) complex an- 
tenna gain pattern in the data domain. Since it is direction- 
dependent, corrections for it fundamentally cannot be decou- 
pled from imaging and must be corrected for during imaging 
(see Rau et al. 2009). Tht A-Projection algorithm, which cor- 
rects rorTEe DD antenna gains during imaging, therefore can 
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be thought of as an algorithm for DD calibration. Similarly, 
the WB A-Projection algorithm which includes corrections for 
the frequency dependence of the A-term can be thought of as 
the DD generalization of the standard bandpass calibration al- 
gorithm. We feel that making these connections with simpler, 
intuitively better understood and widely used algorithms in 
the community makes it easier to understand the newer more 
general techniques. 

We also analyzed hybrid schemes for wide-band imaging 
using the NB A-Projection and image-plane correction for the 
effects of PB. Our conclusion is that while for non-parallel 
implementation, WB A-Projection is required for wide-band 
imaging, for implementations which may require partitioning 
the data along time and/or frequency axis, hybrid approaches 
are also sufficient. However the use of WB A-Projection 
in implementations on parallel processing platforms allows 
the freedom to tune the distribution of the data to suite the 
available hardware and computing resources (e.g., this allows 
imaging smaller chunks of the total bandwidth in parallel, 
even if these smaller chunks need wide-band PB corrections. 
Without WB A-Projection, the data distribution is restricted to 
be partitioned in frequency such that each chunk can be im- 
aged using NB A-Projection). 

Comparisons show that the WB A-Projection plus MT-MFS 
enables simultaneous intensity and spectral index imaging 
throughout the PB in wide-field imaging. Moderate dynamic 
range imaging within the half -power point of the PB is pos- 
sible where all frequency dependence in the image is ab- 
sorbed in the solution of the MT-MFS algorithm. Beyond this 
field-of-view (FoV), errors increase because a time-averaged 
primary-beam spectrum is not a good estimate in regions of 



the image where PB changes by 100% with time as the beams 
rotate on the sky. The FoV can be increased till ~ 10% point 
of the PB by combining MT-MFS with NB A-Projection. The 
time-dependence of the PB is accounted for via A-Projection 
and its frequency dependence is absorbed in MT-MFS. Using 
larger number of Taylor-terms in MT-MFS improves the imag- 
ing performance for simpler fields, but is inadvisable because 
of instability in low-SNR regions. 

Finally, we would like to note that while only the effects of 
the antenna PB were included in the A-term used in this paper, 
other antenna-based DD effect can also be easily included. 
The effect of non-isoplanatic ionospheric/atmospheric phases 
is comparable to the effect of PB for wide-band wide-field 
imaging at low frequencies, particularly with aperture -array 
antenna elements. Similar effects come from the irregu- 
larities in the water vapor content in the lower atmosphere 
for imaging at high frequencies. The effects due to iono- 
sphere/atmosphere and PB need to be corrected simultane- 
ously, often for wide-band data in full polarization. It may be 
possible to extend the WB A-Projection algorithm presented 
here to include corrections for ionospheric effects. Work to 
test these extensions is underway and will be reported in fu- 
ture publications. 
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